Abstract — In 1985 Kaspi provided a single-letter characterization
of the sum-rate-distortion function for a two-way lossy source
coding problem in which two terminals send multiple messages 
back and forth with the goal of reproducing each other's sources. 
Yet, the question remained whether more messages can strictly 
improve the sum-rate-distortion function. Viewing the sum-rate 
as a functional of the distortions and the joint source distribution 
and leveraging its convex-geometric properties, we construct an 
example which shows that two messages can strictly improve the 
one-message (Wyner-Ziv) rate-distortion function. The example 
also shows that the ratio of the one-message rate to the two- 
message sum-rate can be arbitrarily large and simultaneously 
the ratio of the backward rate to the forward rate in the two- 
message sum-rate can be arbitrarily small. 

I. Introduction 

Consider the following two-way lossy source coding problem studied in [1].
Let $(X(1), Y(1)), \ldots, (X(n), Y(n))$ be $n$ iid
samples of a two-component discrete memoryless stationary
source with joint pmf $p_{XY}(x,y)$, $(x,y) \in \mathcal{X} \times \mathcal{Y}$, $|\mathcal{X} \times \mathcal{Y}| < \infty$.
Terminal A observes $\mathbf{X} := (X(1), \ldots, X(n))$ and terminal B
observes $\mathbf{Y} := (Y(1), \ldots, Y(n))$. Terminal B is required to
produce $\hat{\mathbf{X}} := (\hat{X}(1), \ldots, \hat{X}(n)) \in \hat{\mathcal{X}}^n$, where $\hat{\mathcal{X}}$ is a reproduction
alphabet with $|\hat{\mathcal{X}}| < \infty$, such that the expected distortion
$E[d^{(n)}(\mathbf{X}, \hat{\mathbf{X}})]$ does not exceed a desired level, where

$$d^{(n)}(\mathbf{x}, \hat{\mathbf{x}}) := \frac{1}{n} \sum_{i=1}^{n} d(x(i), \hat{x}(i)),$$

and $d : \mathcal{X} \times \hat{\mathcal{X}} \to \mathbb{R}^{+} \cup \{\infty\}$ is a per-sample (single-letter)
distortion function. Terminal A is likewise required
to reproduce the source observed at terminal B within some 
distortion level with respect to another (possibly different) 
distortion function. To achieve this objective, the terminals 
are allowed to send a certain number of messages back and 
forth where each message sent from a terminal at any time only 
depends on the information available at the terminal up to that 
time. In [1], Kaspi provided a single-letter characterization
of the sum-rate-distortion function for any finite number of 
messages. Yet, whether more messages can strictly improve 
the sum-rate-distortion function was left unresolved. If the
goal is to reproduce both sources losslessly at each terminal
(zero distortion) then there is no advantage in using multiple
messages; two messages are sufficient and the minimum sum-rate
cannot be reduced by using more than two messages.$^2$ If,
however, the goal is changed to losslessly compute functions of
sources at each terminal, then multiple messages can decrease
the minimum sum-rate by an arbitrarily large factor [3],
[4]. Therefore, the key unresolved question pertains to lossy
source reproduction: can multiple messages strictly decrease 
the minimum sum-rate for a given (nonzero) distortion? This 
question is unresolved even when only one source needs to be 
reproduced with nonzero distortion. 

In this paper, we construct the first example which shows 
that two messages can strictly improve the one-message 
(Wyner-Ziv) rate-distortion function. The example also shows 
that the ratio of the one-message rate to the two-message sum- 
rate can be arbitrarily large and simultaneously the ratio of the 
backward rate to the forward rate in the two-message sum- 
rate can be arbitrarily small. The key idea which enables the 
construction of this example is that the sum-rate is a functional
of the distortion and the joint source distribution which has 
certain convex-geometric properties. 

II. Problem setup and related prior results 

A. One-message Wyner-Ziv rate-distortion function 

Definition 1: A one-message distributed source code with
parameters $(n, |\mathcal{M}|)$ is the tuple $(e^{(n)}, g^{(n)})$ consisting of an
encoding function $e^{(n)} : \mathcal{X}^n \to \mathcal{M}$ and a decoding function
$g^{(n)} : \mathcal{Y}^n \times \mathcal{M} \to \hat{\mathcal{X}}^n$. The output of $g^{(n)}$, denoted by $\hat{\mathbf{X}}$, is
called the reproduction and $(1/n)\log_2 |\mathcal{M}|$ is called the block-coding
rate (in bits per sample).

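Definition 2: A tuple $(R, D)$ is admissible for one-message
distributed source coding if, $\forall \epsilon > 0$, $\exists\, \bar{n}(\epsilon)$ such that
$\forall n > \bar{n}(\epsilon)$, there exists a one-message distributed source code
with parameters $(n, |\mathcal{M}|)$ satisfying $\frac{1}{n} \log_2 |\mathcal{M}| \le R + \epsilon$, and
$E[d^{(n)}(\mathbf{X}, \hat{\mathbf{X}})] \le D + \epsilon$.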

The set of all admissible $(R, D)$ tuples in Definition 2 is a
closed subset of $\mathbb{R}^2$. For any $D \in \mathbb{R}$, the minimum value of $R$

$^2$ If only one of the sources is required to be losslessly reproduced at the
other terminal then one message is sufficient and the minimum sum-rate
cannot be improved by using more than one message. However, if $X$ and
$Y$ are nonergodic, two-way interactive coding can be strictly better than
one-way non-interactive coding [2].



such that $(R, D)$ is admissible is the one-message Wyner-Ziv
rate-distortion function [5] and will be denoted by $R_{\mathrm{sum},1}(D)$.
The following single-letter characterization of $R_{\mathrm{sum},1}(D)$ was
established in [5]:



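$$R_{\mathrm{sum},1}(D) = \min_{p_{U|X},\, g:\ E[d(X, g(U,Y))] \le D} I(X; U | Y), \tag{2.1}$$

where $U \in \mathcal{U}$ is an auxiliary random variable such that $U - X - Y$
is a Markov chain and $|\mathcal{U}| \le |\mathcal{X}| + 1$, and $g : \mathcal{U} \times \mathcal{Y} \to \hat{\mathcal{X}}$
is a deterministic single-letter decoding function.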
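
As a hedged illustration of the objective and the constraint in (2.1), the following Python sketch evaluates $I(X;U|Y)$ and $E[d(X, g(U,Y))]$ for one candidate pair $(p_{U|X}, g)$. It evaluates a single feasible point and does not perform the minimization. The function names, the example source (a doubly symmetric binary source with crossover probability 0.1), and the erasure-style test channel are illustrative assumptions that anticipate the example of Section III; they are not part of the characterization itself.

```python
from math import log2

def cond_mutual_information(p_xy, p_u_given_x):
    """I(X;U|Y) in bits when U - X - Y is a Markov chain.

    p_xy[x][y] is the joint source pmf, p_u_given_x[x][u] the test channel.
    """
    nx, ny, nu = len(p_xy), len(p_xy[0]), len(p_u_given_x[0])
    total = 0.0
    for y in range(ny):
        p_y = sum(p_xy[x][y] for x in range(nx))
        for u in range(nu):
            p_uy = sum(p_xy[x][y] * p_u_given_x[x][u] for x in range(nx))
            for x in range(nx):
                p_xuy = p_xy[x][y] * p_u_given_x[x][u]
                if p_xuy > 0:
                    # log of p(x,u|y) / (p(x|y) p(u|y))
                    total += p_xuy * log2(p_xuy * p_y / (p_xy[x][y] * p_uy))
    return total

def expected_distortion(p_xy, p_u_given_x, g, d):
    """E[d(X, g(U, Y))] for a decoder g(u, y) and a distortion d(x, xhat)."""
    total = 0.0
    for x, row in enumerate(p_xy):
        for y, pxy in enumerate(row):
            for u, pu in enumerate(p_u_given_x[x]):
                if pxy * pu > 0:  # skip zero-probability events (avoids 0 * inf)
                    total += pxy * pu * d(x, g(u, y))
    return total

# Hypothetical example source: binary X, Y with P(X != Y) = p, Y uniform.
p = 0.1
p_xy = [[(1 - p) / 2, p / 2], [p / 2, (1 - p) / 2]]

# Hypothetical test channel with outputs (0, 1, erasure): reveal X w.p. 1 - a.
a = 0.3
ERASE = 2
channel = [[1 - a, 0.0, a], [0.0, 1 - a, a]]

def d(x, xhat):
    """Erasure-type distortion: 0 if correct, 1 if erased, infinity if wrong."""
    if xhat == ERASE:
        return 1.0
    return 0.0 if xhat == x else float("inf")

rate = cond_mutual_information(p_xy, channel)
dist = expected_distortion(p_xy, channel, lambda u, y: u, d)
print(f"I(X;U|Y) = {rate:.6f} bits, E[d] = {dist:.3f}")
```

For this symmetric choice the rate works out to $(1-a)\,h(p)$ and the distortion to $a$, which gives a quick consistency check on the two helper functions.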

B. Two-message sum-rate-distortion function 

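Definition 3: A two-message distributed source code with
parameters $(n, |\mathcal{M}_1|, |\mathcal{M}_2|)$ is the tuple $(e_1^{(n)}, e_2^{(n)}, g^{(n)})$ consisting
of encoding functions $e_1^{(n)} : \mathcal{Y}^n \to \mathcal{M}_1$, $e_2^{(n)} : \mathcal{X}^n \times \mathcal{M}_1 \to \mathcal{M}_2$
and a decoding function $g^{(n)} : \mathcal{Y}^n \times \mathcal{M}_1 \times \mathcal{M}_2 \to \hat{\mathcal{X}}^n$. The
output of $g^{(n)}$, denoted by $\hat{\mathbf{X}}$, is called the reproduction and
for $i = 1, 2$, $(1/n)\log_2 |\mathcal{M}_i|$ is called the $i$-th block-coding rate.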

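Definition 4: A tuple $(R_1, R_2, D)$ is admissible for two-message
distributed source coding if, $\forall \epsilon > 0$, $\exists\, \bar{n}(\epsilon)$ such that
$\forall n > \bar{n}(\epsilon)$, there exists a two-message distributed source code
with parameters $(n, |\mathcal{M}_1|, |\mathcal{M}_2|)$ satisfying $\frac{1}{n} \log_2 |\mathcal{M}_i| \le R_i + \epsilon$,
for $i = 1, 2$, and $E[d^{(n)}(\mathbf{X}, \hat{\mathbf{X}})] \le D + \epsilon$.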

The rate-distortion region, denoted by $\mathcal{RD}$, is defined as the
set of all admissible $(R_1, R_2, D)$ tuples and is a closed subset
of $\mathbb{R}^3$. For any $D \in \mathbb{R}$, the minimum value of $(R_1 + R_2)$ such
that $(R_1, R_2, D) \in \mathcal{RD}$ is the two-message sum-rate-distortion
function and will be denoted by $R_{\mathrm{sum},2}(D)$. The following
single-letter characterization of $\mathcal{RD}$ was established in [1]:

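$$\begin{aligned}
\mathcal{RD} = \{\, (R_1, R_2, D) \mid\ & \exists\, p_{V_1|Y},\, p_{V_2|X V_1},\, g, \text{ s.t.} \\
& R_1 \ge I(Y; V_1 | X), \\
& R_2 \ge I(X; V_2 | Y, V_1), \\
& E[d(X, g(V_1, V_2, Y))] \le D \,\},
\end{aligned} \tag{2.2}$$

where $V_1 \in \mathcal{V}_1$ and $V_2 \in \mathcal{V}_2$ are auxiliary random variables
with bounded alphabets$^3$ such that the Markov chains $V_1 - Y - X$
and $V_2 - (X, V_1) - Y$ hold, and $g : \mathcal{V}_1 \times \mathcal{V}_2 \times \mathcal{Y} \to \hat{\mathcal{X}}$
is a deterministic single-letter decoding function. From (2.2),
it follows that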



$$R_{\mathrm{sum},2}(D) = \min_{p_{V_1|Y},\, p_{V_2|X V_1},\, g:\ E[d(X, g(V_1, V_2, Y))] \le D} \left[ I(Y; V_1 | X) + I(X; V_2 | Y, V_1) \right]. \tag{2.3}$$

Since a one-message code is a special case of a two-message
code with $|\mathcal{M}_1| = 1$, the inequality $R_{\mathrm{sum},2}(D) \le R_{\mathrm{sum},1}(D)$
holds for all $D \in \mathbb{R}$. Even though the single-letter characterizations
of $R_{\mathrm{sum},1}(D)$ and $R_{\mathrm{sum},2}(D)$ are known, it has proved
difficult to demonstrate the existence of $p_{XY}$, $d$, and $D$ such
that $R_{\mathrm{sum},2}(D) < R_{\mathrm{sum},1}(D)$. In the distributed source coding
literature, to the best of our knowledge, there is neither an
explicit example which shows that $R_{\mathrm{sum},2}(D) < R_{\mathrm{sum},1}(D)$ nor
an implicit proof that such an example must exist nor a proof
that there is no such example. In this paper we will construct
an explicit example for which $R_{\mathrm{sum},2}(D) < R_{\mathrm{sum},1}(D)$.

$^3$ Bounds for the cardinalities of $\mathcal{V}_1$ and $\mathcal{V}_2$ can be found in [1].



In [6], [7], for a general $t \in \mathbb{Z}^{+}$, we established a connection
between the $t$-message sum-rate-distortion function and the
$(t-1)$-message sum-rate-distortion function using the rate
reduction functional defined in the next subsection. This
connection and the properties of the rate reduction functional
allow one to compare $R_{\mathrm{sum},2}(D)$ and $R_{\mathrm{sum},1}(D)$ without having
to explicitly solve the optimization problem in (2.3).

C. Key tool: rate reduction functionals 

Generally speaking, for $i = 1, 2$, $R_{\mathrm{sum},i}$ depends on
$(p_{XY}, d, D)$. As in [6], [7], we fix $d$ and view $R_{\mathrm{sum},i}$ as a
functional of $(p_{XY}, D)$. The sum-rate needed to reproduce only
terminal A's source at terminal B with nonzero distortion
can only be smaller than the sum-rate needed to losslessly
reproduce both sources at both terminals which is equal to
$H(X|Y) + H(Y|X)$. The reduction in the rate for lossy source
reproduction in comparison to lossless source reproduction of
both sources at both terminals is the rate-reduction functional.
Specifically, the rate reduction functionals [7] are defined as
follows. For $i = 1, 2$,

$$\rho_i(p_{XY}, D) := H(X|Y) + H(Y|X) - R_{\mathrm{sum},i}(p_{XY}, D). \tag{2.4}$$

Since $R_{\mathrm{sum},1} \ge R_{\mathrm{sum},2}$ and $\rho_1 \le \rho_2$ always hold,
$R_{\mathrm{sum},1}(p_{XY}, D) > R_{\mathrm{sum},2}(p_{XY}, D)$ if, and only if, $\rho_1(p_{XY}, D) < \rho_2(p_{XY}, D)$,
i.e., if, and only if, $\rho_1(p_{XY}, D) \ne \rho_2(p_{XY}, D)$. The
following key lemma provides a means for testing whether or
not $\rho_1 = \rho_2$ without ever having to evaluate or work with $\rho_2$,
i.e., without explicitly constructing auxiliary variables $V_1$, $V_2$
and the decoding function $g$ in (2.3).

Lemma 1: The following two conditions are equivalent: (1)
For all $p_{XY}$ and $D$, $\rho_1(p_{XY}, D) = \rho_2(p_{XY}, D)$. (2) For all $p_{X|Y}$,
$\rho_1(p_{X|Y} p_Y, D)$ is concave with respect to $(p_Y, D)$.

In simple terms, $\rho_1 = \rho_2$ if, and only if, $\rho_1$ is concave
under $Y$-marginal and distortion perturbations. The proof of
Lemma 1 is along the lines of the proof of part (i) of
Theorem 2 in [7] and is omitted. In fact, it can be proved that
if for some $t \in \mathbb{Z}^{+}$, the $t$-message rate-reduction functional
is identically equal to the $(t+1)$-message rate reduction
functional, i.e., $\rho_t = \rho_{t+1}$, then $\rho_t = \rho_{\infty}$, the infinite-message
rate-reduction functional. As discussed in [7, Remark 6],
Lemma 1 does not hold if all the rate reduction functionals are
replaced by the sum-rate-distortion functionals. Therefore the
rate reduction functional is the key to the connection between
a one-message distributed source coding scheme and a two-message
distributed source coding scheme.

The remainder of this paper is organized as follows. In Theorem 1
we will use Lemma 1 to show that there exist $p_{XY}$, $d$,
and $D$ for which $R_{\mathrm{sum},1}(p_{XY}, D) > R_{\mathrm{sum},2}(p_{XY}, D)$. We will do
this by (i) choosing $p_{X|Y}$ so that $X$ and $Y$ are symmetrically
correlated binary random variables with $P(Y \ne X) = p$, (ii)
taking $d(x, \hat{x})$ to be the binary erasure distortion function, (iii)
selecting a value for $D$, and (iv) showing that $\rho_1(p_{X|Y} p_Y, D)$
is not concave with respect to $p_Y$. By Lemma 1 this would
imply that $\rho_1(p_{XY}, D) \ne \rho_2(p_{XY}, D)$ which, in turn, would
imply that $R_{\mathrm{sum},1}(p_{XY}, D) > R_{\mathrm{sum},2}(p_{XY}, D)$. In Theorem 2 we
will show that for certain values of parameters $p$ and $D$, the
two-message sum-rate can be split in such a way that the
ratio $R_1/R_2$ is arbitrarily small and simultaneously the ratio
$R_{\mathrm{sum},1}/(R_1 + R_2)$ is arbitrarily large. This will be proved by
explicitly constructing auxiliary variables $V_1$, $V_2$ and decoding
function $g$ in (2.3). While the explicit construction of $V_1$, $V_2$
and $g$ in the proof of Theorem 2 may make the implicit proof
of Theorem 1 seem redundant, it is unclear how the explicit
construction can be generalized to other families of source
distributions and distortion functions. The approach followed
in the proof of Theorem 1, on the other hand, provides an
efficient method to test whether the best two-message scheme
can strictly outperform the best one-message scheme for more
general distributed source coding and function computation
problems. The implicit proof naturally points to an explicit
construction and was, in fact, the path taken by the authors to
arrive at the explicit construction.

III. Main results 

Theorem 1: There exist a distortion function $d$, a joint
distribution $p_{XY}$, and a distortion level $D$ for which

$$R_{\mathrm{sum},1}(p_{XY}, D) > R_{\mathrm{sum},2}(p_{XY}, D).$$

Proof: In the light of the discussion in Section II-C, to
prove Theorem 1 it is sufficient to show there exist $p_{X|Y}$, $d$,
and $D$ for which $\rho_1(p_{X|Y} p_Y, D)$ is not concave with respect to
$p_Y$. In particular, it is sufficient to show that there exist $p_{Y,1}$
and $p_{Y,2}$ such that

$$\rho_1\!\left(p_{X|Y}\, \frac{p_{Y,1} + p_{Y,2}}{2},\, D\right) < \frac{\rho_1(p_{X|Y}\, p_{Y,1}, D) + \rho_1(p_{X|Y}\, p_{Y,2}, D)}{2}. \tag{3.5}$$

Let $\mathcal{X} = \mathcal{Y} = \{0, 1\}$, and $\hat{\mathcal{X}} = \{0, e, 1\}$. Let $d$ be the binary
erasure distortion function, i.e., $d : \{0, 1\} \times \{0, e, 1\} \to \{0, 1, \infty\}$
and for $i = 0, 1$, $d(i, i) = 0$, $d(i, 1 - i) = \infty$, and $d(i, e) = 1$.
Let $p_{Y,1}(1) = 1 - p_{Y,1}(0) = p_{Y,2}(0) = 1 - p_{Y,2}(1) = q$, i.e.,
$p_{Y,1}$ = Bernoulli($q$) and $p_{Y,2}$ = Bernoulli($\bar{q}$).$^4$ Let $p_{X|Y}$ be
the conditional pmf of the binary symmetric channel with
crossover probability $p$, i.e., $p_{X|Y}(1|0) = p_{X|Y}(0|1) = p$. Let
$p_Y := (p_{Y,1} + p_{Y,2})/2$ which is Bernoulli(1/2). The joint
distribution $p_{XY} = p_Y p_{X|Y}$ is the joint pmf of a pair of doubly
symmetric binary sources (DSBS) with parameter $p$, i.e., if $p_{xy}$
denotes $p_{XY}(x, y)$, then $p_{00} = p_{11} = \bar{p}/2$ and $p_{01} = p_{10} = p/2$.
For these choices of $p_{X|Y}$, $p_{Y,1}$, $p_{Y,2}$, $p_Y$, and $d$, we will
analyze the left and right sides of (3.5) step by step through
a sequence of definitions and propositions and establish the
strict inequality for a suitable choice of $D$. The proofs of all
the propositions are given in Section IV.
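
The setup above can be sketched in a few lines of Python. This is only an illustrative encoding of the source and distortion: the helper names and the numerical values of $p$ and $q$ are assumptions, and the final lines simply confirm that averaging the two asymmetric joint pmfs yields the DSBS.

```python
def joint_pmf(p, p_y1):
    """p_{X|Y} p_Y for a binary symmetric channel with crossover p and P(Y = 1) = p_y1.

    Returns a dict keyed by (x, y).
    """
    pmf = {}
    for y, py in ((0, 1 - p_y1), (1, p_y1)):
        for x in (0, 1):
            pmf[(x, y)] = py * (p if x != y else 1 - p)
    return pmf

def erasure_distortion(x, xhat):
    """Binary erasure distortion: 0 if correct, 1 if erased ('e'), infinity if wrong."""
    if xhat == "e":
        return 1.0
    return 0.0 if xhat == x else float("inf")

p, q = 0.1, 0.2                  # illustrative values only
joint_1 = joint_pmf(p, q)        # p_{X|Y} p_{Y,1}, with Y ~ Bernoulli(q)
joint_2 = joint_pmf(p, 1 - q)    # p_{X|Y} p_{Y,2}, with Y ~ Bernoulli(1 - q)

# The midpoint of the two Y-marginals is Bernoulli(1/2), so the mixture is a DSBS.
midpoint = {k: 0.5 * (joint_1[k] + joint_2[k]) for k in joint_1}
dsbs = joint_pmf(p, 0.5)
print(midpoint)
print(dsbs)
```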
• Left-side of (3.5): From (2.1) and (2.4) we have

$$\rho_1(p_{XY}, D) = \max_{p_{U|X},\, g:\ E[d(X, g(U,Y))] \le D} \{ H(X|Y,U) + H(Y|X) \}. \tag{3.6}$$

For the binary erasure distortion and a full support joint source
pmf taking values in binary alphabets, (3.6) simplifies to the
expression given in Proposition 1.

Proposition 1: If $\mathcal{X} = \mathcal{Y} = \{0, 1\}$, $\mathrm{supp}(p_{XY}) = \{0, 1\}^2$,
$d$ is the binary erasure distortion, and $D \in \mathbb{R}$, then $\rho_1 =$

$^4$ For any $a \in [0, 1]$, $\bar{a} := 1 - a$. For the erasure symbol $e$, $\bar{e} := e$.



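$\max_{p_{U|X}} (H(X|Y,U) + H(Y|X))$, where $\mathcal{U} = \{0, e, 1\}$ and

$$p_{U|X}(u|x) = \begin{cases}
\alpha_{0e}, & \text{if } x = 0,\ u = e, \\
1 - \alpha_{0e}, & \text{if } x = 0,\ u = 0, \\
\alpha_{1e}, & \text{if } x = 1,\ u = e, \\
1 - \alpha_{1e}, & \text{if } x = 1,\ u = 1, \\
0, & \text{otherwise},
\end{cases} \tag{3.7}$$

where $\alpha_{0e}, \alpha_{1e} \in [0, 1]$ satisfy $E[d(X, U)] = p_X(0)\alpha_{0e} + p_X(1)\alpha_{1e} \le D$.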

The expression for $\rho_1$ further simplifies to the one in
Proposition 2 by using $p_{U|X}$ given by (3.7) in (3.6).

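Proposition 2: If $\mathcal{X} = \mathcal{Y} = \{0, 1\}$, $\mathrm{supp}(p_{XY}) = \{0, 1\}^2$, $d$ is
the binary erasure distortion, and $D \in \mathbb{R}$, then

$$\rho_1(p_{XY}, D) = \max_{\alpha_{0e}, \alpha_{1e} \in [0,1]:\ \phi(p_{XY}, \alpha_{0e}, \alpha_{1e}) \le D} \psi(p_{XY}, \alpha_{0e}, \alpha_{1e}), \tag{3.8}$$

where

$$\begin{aligned}
\psi(p_{XY}, \alpha_{0e}, \alpha_{1e}) :=\ & (p_{00}\alpha_{0e} + p_{10}\alpha_{1e})\, h\!\left(\frac{p_{00}\alpha_{0e}}{p_{00}\alpha_{0e} + p_{10}\alpha_{1e}}\right) \\
& + (p_{01}\alpha_{0e} + p_{11}\alpha_{1e})\, h\!\left(\frac{p_{01}\alpha_{0e}}{p_{01}\alpha_{0e} + p_{11}\alpha_{1e}}\right) \\
& + (p_{00} + p_{01})\, h\!\left(\frac{p_{00}}{p_{00} + p_{01}}\right) + (p_{11} + p_{10})\, h\!\left(\frac{p_{11}}{p_{11} + p_{10}}\right),
\end{aligned}$$

$\phi(p_{XY}, \alpha_{0e}, \alpha_{1e}) := p_X(0)\alpha_{0e} + p_X(1)\alpha_{1e}$, and $h$ is the binary
entropy function: $h(\theta) := -\theta \log_2 \theta - \bar{\theta} \log_2 \bar{\theta}$, $\theta \in [0, 1]$.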

Finally, for a DSBS with parameter $p$ and the binary
erasure distortion, $\rho_1$ reduces to the compact expression in
Proposition 3.

Proposition 3: If $d$ is the binary erasure distortion, $D \in [0, 1]$,
and $p_{XY}$ is the joint pmf of a DSBS with parameter $p$,
then

$$\rho_1(p_{XY}, D) = (1 + D)\, h(p). \tag{3.9}$$
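
As a numerical sanity check of Proposition 3, the following Python sketch maximizes $H(X|Y,U) + H(Y|X)$ over a grid of erasure probabilities $(\alpha_{0e}, \alpha_{1e})$ subject to the distortion constraint and compares the result with $(1+D)h(p)$. The grid resolution, the tolerance in the constraint, the helper names, and the values $p = 0.1$, $D = 0.3$ are all illustrative assumptions; a grid search only approximates the maximization in (3.8).

```python
from math import log2

def weighted_entropy(a, b):
    """(a + b) * h(a / (a + b)) in bits; zero if either weight is zero."""
    if a <= 0 or b <= 0:
        return 0.0
    s = a + b
    return -(a * log2(a / s) + b * log2(b / s))

def psi(pmf, a0, a1):
    """H(X|Y,U) + H(Y|X) for the erasure-type test channel (3.7)."""
    p00, p01, p10, p11 = pmf
    return (weighted_entropy(p00 * a0, p10 * a1)    # Y = 0, U = e
            + weighted_entropy(p01 * a0, p11 * a1)  # Y = 1, U = e
            + weighted_entropy(p00, p01)            # H(Y|X = 0) term
            + weighted_entropy(p11, p10))           # H(Y|X = 1) term

def phi(pmf, a0, a1):
    """Expected erasure distortion p_X(0) a0 + p_X(1) a1."""
    p00, p01, p10, p11 = pmf
    return (p00 + p01) * a0 + (p10 + p11) * a1

def rho1_grid(pmf, D, steps=100):
    """Grid approximation of the constrained maximization of psi."""
    best = float("-inf")
    for i in range(steps + 1):
        for j in range(steps + 1):
            a0, a1 = i / steps, j / steps
            if phi(pmf, a0, a1) <= D + 1e-12:  # small slack for rounding
                best = max(best, psi(pmf, a0, a1))
    return best

p, D = 0.1, 0.3
dsbs = ((1 - p) / 2, p / 2, p / 2, (1 - p) / 2)  # (p00, p01, p10, p11)
h_p = weighted_entropy(p, 1 - p)
grid_value = rho1_grid(dsbs, D)
closed_form = (1 + D) * h_p
print(grid_value, closed_form)
```

Because $D = 0.3$ lies on the grid, the symmetric point $\alpha_{0e} = \alpha_{1e} = D$ is evaluated exactly, so the two printed values should agree up to rounding.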



• Right-side of (3.5): Evaluating the rate reduction functionals on
the right-side of (3.5) requires solving the maximization problem
(3.8) for the asymmetric distributions $p_{X|Y} p_{Y,1}$ and $p_{X|Y} p_{Y,2}$.
Exactly solving this problem is cumbersome but it is easy to
provide a lower bound for the maximum as follows.

Proposition 4: If $d$ is the binary erasure distortion, $p_{Y,1}$ is
Bernoulli($q$), $p_{Y,2}$ is Bernoulli($\bar{q}$), and $p_{X|Y}$ is the conditional
pmf of the binary symmetric channel with crossover probability
$p$, then the inequality

$$\frac{\rho_1(p_{X|Y}\, p_{Y,1}, D) + \rho_1(p_{X|Y}\, p_{Y,2}, D)}{2} \ge C(p, q, \alpha_{0e}, 1) \tag{3.10}$$

holds for $D = \eta(p, q, \alpha_{0e}, 1)$, where

$$C(p, q, \alpha_{0e}, \alpha_{1e}) := \psi(p_{X|Y}\, p_{Y,1}, \alpha_{0e}, \alpha_{1e}), \qquad \eta(p, q, \alpha_{0e}, \alpha_{1e}) := \phi(p_{X|Y}\, p_{Y,1}, \alpha_{0e}, \alpha_{1e}).$$

Remark 1: The rate-distortion tuple $(H(X|Y) + H(Y|X) - C(p, q, \alpha_{0e}, 1),\ \eta(p, q, \alpha_{0e}, 1))$
is admissible for one-message
source coding for joint source distribution $p_{X|Y} p_{Y,1}$ and corresponds
to choosing $p_{U|X}$ given by (3.7) with $\alpha_{1e} = 1$ and the
decoding function $g(u, y) = u$. Since this choice of $p_{U|X}$ and $g$
may be suboptimal, $C(p, q, \alpha_{0e}, 1)$ is only a lower bound for
the rate reduction functional.

• Comparing left and right sides of (3.5): The left-side of
(3.5) and the lower bound of the right-side of (3.5) can be
compared as follows.

Proposition 5: Let $d$ be the binary erasure distortion, $p_Y$
be Bernoulli(1/2), and $p_{X|Y}$ be the binary symmetric channel
with parameter $p$. For all $q \in (0, 1/2)$ and all $\alpha_{0e} \in (0, 1)$, there
exists $p \in (0, 1)$ such that the strict inequality $\rho_1(p_{XY}, D) < C(p, q, \alpha_{0e}, 1)$
holds for $D = \eta(p, q, \alpha_{0e}, 1)$.

Since the left-side of (3.5) is strictly less than a lower bound
of the right-side of (3.5), the strict inequality (3.5) holds, which
completes the proof of Theorem 1. ■
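
The comparison at the heart of this proof can be checked numerically. The Python sketch below evaluates the left side of (3.5) through the closed form $(1+D)h(p)$ and the lower bound $C(p, q, \alpha_{0e}, 1)$ on the right side, using the expression for $C$ obtained by substituting the joint pmf $p_{X|Y}p_{Y,1}$ into $\psi$. The helper names and the sample parameters ($p = 10^{-3}$, $q = 0.1$, $\alpha_{0e} = 0.5$) are illustrative assumptions, and a positive gap at one parameter point is only a spot check, not a substitute for Proposition 5.

```python
from math import log2

def weighted_entropy(a, b):
    """(a + b) * h(a / (a + b)) in bits; zero if either weight is zero."""
    if a <= 0 or b <= 0:
        return 0.0
    s = a + b
    return -(a * log2(a / s) + b * log2(b / s))

def C(p, q, a0, a1):
    """psi evaluated at the joint pmf p_{X|Y} p_{Y,1} (Y ~ Bernoulli(q), BSC(p))."""
    pb, qb = 1 - p, 1 - q
    return (weighted_entropy(pb * qb * a0, p * qb * a1)   # Y = 0, U = e
            + weighted_entropy(p * q * a0, pb * q * a1)   # Y = 1, U = e
            + weighted_entropy(pb * qb, p * q)            # H(Y|X = 0) term
            + weighted_entropy(pb * q, p * qb))           # H(Y|X = 1) term

def eta(p, q, a0, a1):
    """phi evaluated at p_{X|Y} p_{Y,1}: p_X(0) a0 + p_X(1) a1."""
    pb, qb = 1 - p, 1 - q
    return (pb * qb + p * q) * a0 + (p * qb + pb * q) * a1

def concavity_gap(p, q, a0):
    """Lower bound on the right side of (3.5) minus its left side, at D = eta(p, q, a0, 1)."""
    D = eta(p, q, a0, 1.0)
    left = (1 + D) * weighted_entropy(p, 1 - p)  # rho_1 of the DSBS via (3.9)
    return C(p, q, a0, 1.0) - left

gap = concavity_gap(p=1e-3, q=0.1, a0=0.5)
print(gap)  # a positive value means midpoint concavity fails at this point
```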

Theorem 2 quantifies the multiplicative reduction in the
sum-rate that is possible with two messages.

Theorem 2: If $d$ is the binary erasure distortion and $p_{XY}$ is
the joint pmf of a DSBS with parameter $p$, then for all $L > 0$
there exists an admissible two-message rate-distortion tuple
$(R_1, R_2, D)$ such that $R_{\mathrm{sum},1}(p_{XY}, D)/(R_1 + R_2) > L$ and $R_1/R_2 < 1/L$.

Proof: We will explicitly construct $p_{V_1|Y}$, $p_{V_2|X V_1}$, and $g$
in (2.2) which lead to an admissible tuple $(R_1, R_2, D)$. Let
$p_{V_1|Y}$ be the conditional pmf of the binary symmetric channel
with crossover probability $q$. Let the conditional distribution
$p_{V_2|X V_1}(v_2 | x, v_1)$ have the form described in Table I and let
$g(v_1, v_2, y) := v_2$.

TABLE I
Conditional distribution $p_{V_2|X V_1}$

$p_{V_2|X V_1}$        $v_2 = 0$       $v_2 = e$     $v_2 = 1$
$x = 0, v_1 = 0$       $1 - \alpha$    $\alpha$      $0$
$x = 1, v_1 = 0$       $0$             $1$           $0$
$x = 0, v_1 = 1$       $0$             $1$           $0$
$x = 1, v_1 = 1$       $0$             $\alpha$      $1 - \alpha$

The corresponding rate-distortion tuple can be shown to 
satisfy the following property. 

Proposition 6: Let $d$ be the binary erasure distortion and
let $p_{XY}$ be the joint pmf of a DSBS with parameter $p$. For
$p_{V_1|Y}$, $p_{V_2|X V_1}$, and $g$ as described above, and all $L > 0$, there exist
parameters $p$, $q$, $\alpha$ such that the two-message rate-distortion
tuple $(R_1, R_2, D)$ given by $R_1 = I(Y; V_1 | X)$, $R_2 = I(X; V_2 | Y, V_1)$,
$D = E[d(X, V_2)]$ satisfies $R_{\mathrm{sum},1}(p_{XY}, D)/(R_1 + R_2) > L$ and
$R_1/R_2 < 1/L$.

This completes the proof of Theorem 2. ■

The conditional pmfs $p_{V_1|Y}$ and $p_{V_2|X V_1}$ in the proof of
Theorem 2 are related to the conditional pmf $p_{U|X}$ in the proof
of Theorem 1 as follows. Given $V_1 = 0$, the conditional distribution
$p_{X Y V_2|V_1}(x, y, v_2 | 0) = p_{Y,1}(y)\, p_{X|Y}(x|y)\, p_{U|X}(v_2|x)$, where
$p_{U|X}$ is given by (3.7) with $\alpha_{0e} = \alpha$ and $\alpha_{1e} = 1$. Given
$V_1 = 1$, the conditional distribution $p_{X Y V_2|V_1}(x, y, v_2 | 1) = p_{Y,2}(y)\, p_{X|Y}(x|y)\, p_{U|X}(v_2|x)$,
where $p_{U|X}$ is given by (3.7) with
$\alpha_{1e} = \alpha$ and $\alpha_{0e} = 1$. Conditioning on $V_1$, in effect,
decomposes the two-message problem into two one-message
problems that were analyzed in the proof of Theorem 1.
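
The construction in the proof of Theorem 2 can be evaluated directly from the joint pmf of $(X, Y, V_1, V_2)$. The Python sketch below builds that pmf, computes $R_1 = I(Y;V_1|X)$, $R_2 = I(X;V_2|Y,V_1)$ and $D = E[d(X,V_2)]$ from entropies of marginals, and compares the sum-rate with the one-message rate $2h(p) - (1+D)h(p)$ implied by (3.9). The function names, the coordinate ordering, and the parameter values ($p = 10^{-6}$, $q = 0.1$, $\alpha = 0.5$) are assumptions made for illustration.

```python
from itertools import product
from math import log2

ERASE = "e"
X, Y, V1, V2 = 0, 1, 2, 3  # positions of the variables in the pmf keys

def joint_pmf(p, q, a):
    """Joint pmf of (X, Y, V1, V2): DSBS(p), V1 = Y through a BSC(q), V2 per Table I."""
    pmf = {}
    for x, y, v1 in product((0, 1), repeat=3):
        p_xy = 0.5 * (p if x != y else 1 - p)
        p_v1 = q if v1 != y else 1 - q
        # Table I: V2 reveals X (w.p. 1 - a) only when X == V1; otherwise it is erased.
        v2_dist = {x: 1 - a, ERASE: a} if x == v1 else {ERASE: 1.0}
        for v2, p_v2 in v2_dist.items():
            pmf[(x, y, v1, v2)] = p_xy * p_v1 * p_v2
    return pmf

def entropy(pmf, coords):
    """Entropy in bits of the marginal on the given coordinate positions."""
    marginal = {}
    for key, prob in pmf.items():
        sub = tuple(key[i] for i in coords)
        marginal[sub] = marginal.get(sub, 0.0) + prob
    return -sum(v * log2(v) for v in marginal.values() if v > 0)

def cond_mi(pmf, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))

def two_message_tuple(p, q, a):
    """(R1, R2, D) for the construction, with decoder g(v1, v2, y) = v2."""
    pmf = joint_pmf(p, q, a)
    r1 = cond_mi(pmf, (Y,), (V1,), (X,))
    r2 = cond_mi(pmf, (X,), (V2,), (Y, V1))
    # V2 is never the wrong bit, so the erasure distortion is 1 on erasures, 0 otherwise.
    dist = sum(prob for key, prob in pmf.items() if key[V2] == ERASE)
    return r1, r2, dist

p, q, a = 1e-6, 0.1, 0.5  # illustrative values only
r1, r2, dist = two_message_tuple(p, q, a)
h_p = -p * log2(p) - (1 - p) * log2(1 - p)
one_message_rate = (1 - dist) * h_p  # 2 h(p) - (1 + D) h(p)
ratio = one_message_rate / (r1 + r2)
print(r1, r2, dist, ratio)
```

Computing the rates from generic entropies, instead of from the closed forms in Section IV, gives an independent check of the construction; the approach relies on differences of order-one quantities, so it is only suitable for moderately small $p$.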

IV. Proofs 

Proof of Proposition 1: Given a general $p_{U|X}$ and $g$ satisfying
the original constraint in (3.6), we will construct $U^*$ satisfying
the stronger constraints in Proposition 1 with an objective
function that is not less than the original one as follows.

Without loss of generality, we assume $\mathrm{supp}(p_U) = \mathcal{U}$. For
$i = 0, 1$, let $\mathcal{U}_i := \{u \in \mathcal{U} : p_{X|U}(i|u) = 1\}$. Let $\mathcal{U}_e := \{u \in \mathcal{U} : p_{X|U}(0|u) \in (0, 1)\}$.
Then $\{\mathcal{U}_0, \mathcal{U}_1, \mathcal{U}_e\}$ forms a partition of $\mathcal{U}$.
For each $u \in \mathcal{U}_e$, since $p_{XY|U}(x, y|u) > 0$ for all $(x, y) \in \{0, 1\}^2$,
it follows that $g(u, y = 0) = g(u, y = 1) = e$ must hold, because
otherwise $E[d(X, g(U, Y))] = \infty$. But for every $u \in \mathcal{U}_i$, $i = 0, 1$,
$g(u, y)$ may equal $i$ or $e$ but not $(1 - i)$ to get a finite distortion.
When we replace $g$ by

$$g^*(u, y) := \begin{cases} i, & \text{if } u \in \mathcal{U}_i,\ i = 0, 1, \\ e, & \text{if } u \in \mathcal{U}_e, \end{cases}$$

the distortion for $u \in \mathcal{U}_i$, $i = 0, 1$, is reduced to zero, and
the distortion for $u \in \mathcal{U}_e$ remains unchanged. Therefore we
have $E[d(X, g^*(U, Y))] \le E[d(X, g(U, Y))] \le D$. Note that
$g^*(U, Y)$ is completely determined by $U$. Let $U^* := g^*(U, Y)$.
Then $U^* = i$ iff $U \in \mathcal{U}_i$, $i \in \{0, 1, e\}$. The objective
function satisfies $H(X|Y, U) + H(Y|X) = H(X|Y, U, U^*) + H(Y|X) \le H(X|Y, U^*) + H(Y|X)$,
which completes the proof. ■

Proof of Proposition 3:

For a fixed $p_{XY}$, $H(X|Y, U) + H(Y|X)$ is concave with respect
to $p_{XYU}$ and therefore also $p_{U|X}$. Since $p_{U|X}$ is linear with
respect to $(\alpha_{0e}, \alpha_{1e})$, $\psi(p_{XY}, \alpha_{0e}, \alpha_{1e}) = H(X|Y, U) + H(Y|X)$ is
concave with respect to $(\alpha_{0e}, \alpha_{1e})$.

The maximum in (3.8) can be achieved along the axis of
symmetry given by $\alpha_{1e} = \alpha_{0e}$ because (i) $\psi$ and $\phi$ are both
symmetric with respect to $\alpha_{0e}$ and $\alpha_{1e}$, i.e., $\psi(p_{XY}, \alpha_{0e}, \alpha_{1e}) = \psi(p_{XY}, \alpha_{1e}, \alpha_{0e})$
and $\phi(p_{XY}, \alpha_{0e}, \alpha_{1e}) = \phi(p_{XY}, \alpha_{1e}, \alpha_{0e})$, and (ii)
$\psi(p_{XY}, \alpha_{0e}, \alpha_{1e})$ is a concave function of $(\alpha_{0e}, \alpha_{1e})$. When $D \in [0, 1]$,
$\rho_1$ can be further simplified as follows.

$$\rho_1(p_{XY}, D) = \max_{\alpha_{0e} = \alpha_{1e} \in [0, D]} \psi(p_{XY}, \alpha_{0e}, \alpha_{1e}) = (1 + D)\, h(p),$$

which completes the proof. ■
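Proof of Proposition 4: For the joint pmf $p_{X|Y} p_{Y,1}$ summarized
in Table II, functions $\psi$ and $\phi$ simplify even further to special
functions of $(p, q, \alpha_{0e}, \alpha_{1e})$ as follows:

TABLE II
Joint distribution $p_{X|Y} p_{Y,1}$

$p_{X|Y} p_{Y,1}$    $y = 0$              $y = 1$
$x = 0$              $\bar{p}\bar{q}$     $pq$
$x = 1$              $p\bar{q}$           $\bar{p}q$

$$\begin{aligned}
C(p, q, \alpha_{0e}, \alpha_{1e}) &= \psi(p_{X|Y}\, p_{Y,1}, \alpha_{0e}, \alpha_{1e}) \\
&= \bar{q}(\bar{p}\alpha_{0e} + p\alpha_{1e})\, h\!\left(\frac{\bar{p}\alpha_{0e}}{\bar{p}\alpha_{0e} + p\alpha_{1e}}\right)
 + q(p\alpha_{0e} + \bar{p}\alpha_{1e})\, h\!\left(\frac{p\alpha_{0e}}{p\alpha_{0e} + \bar{p}\alpha_{1e}}\right) \\
&\quad + (\bar{p}\bar{q} + pq)\, h\!\left(\frac{\bar{p}\bar{q}}{\bar{p}\bar{q} + pq}\right)
 + (\bar{p}q + p\bar{q})\, h\!\left(\frac{\bar{p}q}{\bar{p}q + p\bar{q}}\right), \\
\eta(p, q, \alpha_{0e}, \alpha_{1e}) &= \phi(p_{X|Y}\, p_{Y,1}, \alpha_{0e}, \alpha_{1e}) \\
&= (\bar{p}\bar{q} + pq)\alpha_{0e} + (p\bar{q} + \bar{p}q)\alpha_{1e}.
\end{aligned} \tag{4.11}$$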



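Observe that $C(p, q, \alpha_{0e}, \alpha_{1e}) = C(p, \bar{q}, \alpha_{1e}, \alpha_{0e})$, and
$\eta(p, q, \alpha_{0e}, \alpha_{1e}) = \eta(p, \bar{q}, \alpha_{1e}, \alpha_{0e})$ hold. Therefore we have

$$\begin{aligned}
\rho_1(p_{X|Y}\, p_{Y,2}, D) &= \max_{\alpha_{0e}, \alpha_{1e} \in [0,1]:\ \eta(p, \bar{q}, \alpha_{0e}, \alpha_{1e}) \le D} C(p, \bar{q}, \alpha_{0e}, \alpha_{1e}) \\
&= \max_{\alpha_{0e}, \alpha_{1e} \in [0,1]:\ \eta(p, q, \alpha_{1e}, \alpha_{0e}) \le D} C(p, q, \alpha_{1e}, \alpha_{0e}) \\
&= \rho_1(p_{X|Y}\, p_{Y,1}, D).
\end{aligned}$$

It follows that

$$\frac{\rho_1(p_{X|Y}\, p_{Y,1}, D) + \rho_1(p_{X|Y}\, p_{Y,2}, D)}{2} = \rho_1(p_{X|Y}\, p_{Y,1}, D) \ge C(p, q, \alpha_{0e}, 1)$$

holds for $D = \eta(p, q, \alpha_{0e}, 1)$. ■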
Proof of Proposition 5:

Since $D = \eta(p, q, \alpha_{0e}, 1) \in [0, 1]$ always holds, we
have $\rho_1(p_{XY}, D) = (1 + D)\, h(p)$ due to (3.9). We will
show that for any fixed $q \in (0, 1/2)$ and $\alpha_{0e} \in (0, 1)$,
$\lim_{p \to 0} C(p, q, \alpha_{0e}, 1)/h(p) > \lim_{p \to 0} (1 + D)$ holds, which
implies that $\exists\, p \in (0, 1)$ such that $C(p, q, \alpha_{0e}, 1)/h(p) > (1 + D)$,
which, in turn, implies Proposition 5. It is convenient to use
the following lemma to analyze the limits.

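Lemma 2: Let $f(p)$ be a function differentiable around $p = 0$
such that $f(0) = 0$ and $f'(0) > 0$. Then

$$\lim_{p \to 0} \frac{h(f(p))}{h(p)} = f'(0).$$

Proof: Applying the l'Hôpital rule several times, we have

$$\begin{aligned}
\lim_{p \to 0} \frac{h(f(p))}{h(p)}
&= \lim_{p \to 0} \frac{\ln(1 - f(p)) - \ln f(p)}{\ln(1 - p) - \ln p}\, f'(0) \\
&= \lim_{p \to 0} \frac{\ln f(p)}{\ln p}\, f'(0) \\
&= \lim_{p \to 0} \frac{p}{f(p)}\, (f'(0))^2 \\
&= f'(0),
\end{aligned}$$

which completes the proof of Lemma 2. ■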
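
To get a feel for Lemma 2, the following Python sketch evaluates the ratio $h(f(p))/h(p)$ for a simple linear test function at several values of $p$. The choice $f(p) = 3p$ and the sample points are illustrative assumptions; the entropy helper uses `log1p` so that the term $(1-t)\log(1-t)$ is not lost to rounding when $t$ is extremely small.

```python
from math import log, log1p

def binary_entropy(t):
    """h(t) in bits, written with log1p so that tiny t keeps full precision."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -(t * log(t) + (1.0 - t) * log1p(-t)) / log(2.0)

def entropy_ratio(f, p):
    """h(f(p)) / h(p), which Lemma 2 says tends to f'(0) as p -> 0."""
    return binary_entropy(f(p)) / binary_entropy(p)

def f(p):
    """Hypothetical test function with f(0) = 0 and f'(0) = 3."""
    return 3.0 * p

ratios = {p: entropy_ratio(f, p) for p in (1e-3, 1e-50, 1e-200)}
for p, r in ratios.items():
    print(p, r)
```

The printed ratios approach $f'(0) = 3$ only logarithmically in $1/p$, which is the slow convergence that Remark 2 refers to.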
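Applying Lemma 2 we have

$$\lim_{p \to 0} \frac{C(p, q, \alpha_{0e}, 1)}{h(p)} = 2 - q(1 - \alpha_{0e}), \tag{4.12}$$

$$\lim_{p \to 0} (1 + D) = 2 - \bar{q}(1 - \alpha_{0e}), \tag{4.13}$$

$$\lim_{p \to 0} \left( \frac{C(p, q, \alpha_{0e}, 1)}{h(p)} - (1 + D) \right) = (\bar{q} - q)(1 - \alpha_{0e}) > 0.$$

Therefore for any $\alpha_{0e} \in (0, 1)$ and $q \in (0, 1/2)$, there exists
a small enough $p$ such that $C(p, q, \alpha_{0e}, 1)/h(p) > (1 + D)$ holds,
which completes the proof. ■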

Proof of Proposition 6:

For the rate-distortion tuple $(R_1, R_2, D)$ corresponding to
the choice of $p_{V_1|Y}$, $p_{V_2|X V_1}$ and $g$ described in the proof of
Theorem 2, we have (i) $R_1 = I(Y; V_1 | X) = H(Y|X) - C_2(p, q)$,
where $C_2(p, q)$ is the sum of the last two terms of $C(p, q, \alpha_{0e}, \alpha_{1e})$ in (4.11);
(ii) $R_2 = I(X; V_2 | Y, V_1) = 2h(p) - C(p, q, \alpha, 1) - R_1$; and (iii)
$D = \eta(p, q, \alpha, 1)$. It follows that

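$$\lim_{p \to 0} \frac{R_1}{h(p)} = 0,$$

$$\lim_{p \to 0} \frac{R_2}{h(p)} = 2 - \lim_{p \to 0} \frac{C(p, q, \alpha, 1)}{h(p)} - \lim_{p \to 0} \frac{R_1}{h(p)} = q(1 - \alpha).$$

Therefore for all $q > 0$ and $\alpha \in (0, 1)$, we have

$$\lim_{p \to 0} \frac{R_1}{R_2} = 0. \tag{4.14}$$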

For the one-message rate-distortion function, we have
$R_{\mathrm{sum},1}(p_{XY}, D) = 2h(p) - \rho_1(p_{XY}, D)$, where $\rho_1(p_{XY}, D)$ is given
by (3.9). Therefore we have

$$\lim_{p \to 0} \frac{R_{\mathrm{sum},1}(p_{XY}, D)}{h(p)} = 2 - \lim_{p \to 0} \frac{\rho_1(p_{XY}, D)}{h(p)} = \bar{q}(1 - \alpha),$$

which implies that

$$\lim_{p \to 0} \frac{R_{\mathrm{sum},1}(p_{XY}, D)}{R_1 + R_2} = \frac{\bar{q}}{q}. \tag{4.15}$$

For any $L > 0$, we can always find a small enough $q > 0$ such
that $\bar{q}/q > L + 1$. Due to (4.14) and (4.15), there exists $p > 0$
such that $R_1/R_2 < 1/L$ and $R_{\mathrm{sum},1}/(R_1 + R_2) > L$. ■
Remark 2: The convergence of the limit analyzed in
Lemma 2 is actually slow, because the logarithm function
increases to infinity slowly. The consequence is that if one
chooses a small $q$ to get $R_{\mathrm{sum},1}/(R_1 + R_2)$ close to the limit
$\bar{q}/q$, then $p$ needs to be very small. For example, when
$q = 1/10$, $\alpha_{0e} = 1/2$, $\bar{q}/q = 9$, with $p = 10^{-200}$, we get
$R_{\mathrm{sum},1}/(R_1 + R_2) \approx 8.16$. This, however, does not mean that the
benefit of multiple messages only occurs in extreme cases. In
numerical computations we have observed that for the erasure
distortion, the gain for certain asymmetric sources can be much
more than that for the DSBS example analyzed in this paper.
The DSBS example was chosen in this paper only because it
is easy to analyze.
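
The numerical example in Remark 2 can be explored with the closed-form expressions from the proofs: $R_1 + R_2 = 2h(p) - C(p, q, \alpha, 1)$ and $R_{\mathrm{sum},1} = 2h(p) - (1 + D)h(p)$ with $D = \eta(p, q, \alpha, 1)$. The Python sketch below is one way to do this under those formulas. The helper names are assumptions, and the entropy terms are written with `log1p` because a naive evaluation of $\log(1 - t)$ loses the relevant term entirely at $p = 10^{-200}$.

```python
from math import log, log1p

def weighted_entropy(a, b):
    """(a + b) * h(a / (a + b)) in bits, using log1p to stay accurate when a << b."""
    if a <= 0.0 or b <= 0.0:
        return 0.0
    return (a * log1p(b / a) + b * log1p(a / b)) / log(2.0)

def C(p, q, a0, a1):
    """psi evaluated at the joint pmf p_{X|Y} p_{Y,1} (Y ~ Bernoulli(q), BSC(p))."""
    pb, qb = 1.0 - p, 1.0 - q
    return (weighted_entropy(pb * qb * a0, p * qb * a1)
            + weighted_entropy(p * q * a0, pb * q * a1)
            + weighted_entropy(pb * qb, p * q)
            + weighted_entropy(pb * q, p * qb))

def eta(p, q, a0, a1):
    """Expected erasure distortion for p_{X|Y} p_{Y,1}."""
    pb, qb = 1.0 - p, 1.0 - q
    return (pb * qb + p * q) * a0 + (p * qb + pb * q) * a1

def rate_ratio(p, q, a):
    """R_sum,1 / (R_1 + R_2) for the DSBS example, from the closed forms in the proofs."""
    h_p = weighted_entropy(p, 1.0 - p)
    D = eta(p, q, a, 1.0)
    one_message = (1.0 - D) * h_p               # 2 h(p) - (1 + D) h(p)
    two_message = 2.0 * h_p - C(p, q, a, 1.0)   # R_1 + R_2
    return one_message / two_message

q, a = 0.1, 0.5
limit = (1.0 - q) / q  # the limiting ratio as p -> 0
for p in (1e-6, 1e-50, 1e-200):
    print(p, rate_ratio(p, q, a), limit)
```

Running the loop shows the ratio creeping toward the limit $\bar{q}/q = 9$ as $p$ shrinks, and the value at $p = 10^{-200}$ can be compared against the figure quoted in the remark.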